424 research outputs found

    Large-scale log analysis of digital reading

    In this paper, we address the daily reading practices of the general public in Russia by analyzing 10 months of log data from the commercial ebook site Bookmate. We study several characteristics of ebook reading, namely reading volume and preferences, reading schedule, reading speed, and reading style (including parallel reading patterns and book abandonment rates), with respect to reader gender, book length, and book genre. We find that book genre affects certain reading behaviors, while gender differences and book length seem to play less of a role in ebook reading. Parallel book reading and book abandonment occur very frequently, possibly pointing towards changing reading behaviors in the ebook environment. The obtained insights demonstrate the high potential of log analysis for book reading studies. Copyright © 2016 by Association for Information Science and Technology.

    Protein Models: The Grand Challenge of protein docking

    Characterization of life processes at the molecular level requires structural details of protein–protein interactions (PPIs). The number of experimentally determined protein structures accounts only for a fraction of known proteins. This gap has to be bridged by modeling, typically using experimentally determined structures as templates to model related proteins. The fraction of experimentally determined PPI structures is even smaller than that for the individual proteins, due to a larger number of interactions than the number of individual proteins, and a greater difficulty of crystallizing protein–protein complexes. The approaches to structural modeling of PPI (docking) often have to rely on modeled structures of the interactors, especially in the case of large PPI networks. Structures of modeled proteins are typically less accurate than the ones determined by X-ray crystallography or nuclear magnetic resonance. Thus the utility of approaches to dock these structures should be assessed by thorough benchmarking, specifically designed for protein models. To be credible, such benchmarking has to be based on carefully curated sets of structures with levels of distortion typical for modeled proteins. This article presents such a suite of models built for the benchmark set of the X-ray structures from the Dockground resource (http://dockground.bioinformatics.ku.edu) by a combination of homology modeling and the Nudged Elastic Band method. For each monomer, six models were generated with predefined Cα root mean square deviation from the native structure (1, 2, ..., 6 Å). The sets and the accompanying data provide a comprehensive resource for the development of docking methodology for modeled proteins.
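As a rough illustration of the Cα root mean square deviation mentioned in this abstract, the following minimal sketch computes the RMSD between two sets of Cα coordinates. It assumes the model and native coordinates are already superimposed and listed in the same residue order; the optimal superposition step used in practice is omitted. The function name `ca_rmsd` and the toy coordinates are hypothetical and do not come from the paper.

```python
import math

def ca_rmsd(model, native):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumption: both structures are already superimposed and the i-th
    entries of the two lists refer to the same residue.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared coordinate differences over all atoms.
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, native)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))

# Toy data (hypothetical): every atom of the model is shifted by 1 Å along y.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(ca_rmsd(model, native))
```

A uniform 1 Å shift of every atom gives an RMSD of 1 Å, which corresponds to the smallest distortion level in the range the abstract describes.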

    Structural templates for comparative protein docking

    Structural characterization of protein-protein interactions is important for understanding life processes. Because of the inherent limitations of experimental techniques, such characterization requires computational approaches. Along with the traditional protein-protein docking (free search for a match between two proteins), comparative (template-based) modeling of protein-protein complexes has been gaining popularity. Its development puts an emphasis on full and partial structural similarity between the target protein monomers and the protein-protein complexes previously determined by experimental techniques (templates). The template-based docking relies on the quality and diversity of the template set. We present a carefully curated, non-redundant library of templates containing 4,950 full structures of binary complexes and 5,936 protein-protein interfaces extracted from the full structures at 12Å distance cut-off. Redundancy in the libraries was removed by clustering the PDB structures based on structural similarity. The value of the clustering threshold was determined from the analysis of the clusters and the docking performance on a benchmark set. High structural quality of the interfaces in the template and validation sets was achieved by automated procedures and manual curation. The library is included in the Dockground resource for molecular recognition studies at http://dockground.bioinformatics.ku.edu
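The idea of extracting an interface at a distance cut-off can be sketched as follows. This is only an illustration under stated assumptions: each residue is represented by a single point, and a residue belongs to the interface if it lies within the cut-off of any residue of the partner chain. The abstract does not specify which atoms or distance definition the library uses, and `interface_residues` and the toy chains are hypothetical.

```python
import math

def interface_residues(chain_a, chain_b, cutoff=12.0):
    """Return the residues of each chain lying within `cutoff` Å of the other chain.

    chain_a, chain_b: dicts mapping a residue id to one (x, y, z) point.
    Assumption: one representative point per residue; the actual criterion
    used to build the template library is not given in the abstract.
    """
    in_a, in_b = set(), set()
    for res_a, pos_a in chain_a.items():
        for res_b, pos_b in chain_b.items():
            if math.dist(pos_a, pos_b) <= cutoff:
                in_a.add(res_a)
                in_b.add(res_b)
    return in_a, in_b

# Toy two-chain complex (hypothetical coordinates, in Å).
chain_a = {"A1": (0.0, 0.0, 0.0), "A2": (30.0, 0.0, 0.0)}
chain_b = {"B1": (10.0, 0.0, 0.0), "B2": (50.0, 0.0, 0.0)}
print(interface_residues(chain_a, chain_b))
```

Only the pair A1 and B1 is closer than 12 Å here, so those two residues form the interface. The all-pairs loop is quadratic in the number of residues; a spatial grid or k-d tree would be the usual choice for large structures.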

    Protein Model Docking Benchmark 2

    Structural characterization of protein-protein interactions is essential for our ability to understand life processes. However, only a fraction of known proteins have experimentally determined structures. Such structures provide templates for modeling of a large part of the proteome, where individual proteins can be docked by template-free or template-based techniques. Still, the sensitivity of the docking methods to the inherent inaccuracies of protein models, as opposed to the experimentally determined high-resolution structures, remains largely untested, primarily due to the absence of appropriate benchmark set(s). Structures in such a set should have pre-defined inaccuracy levels and, at the same time, resemble actual protein models in terms of structural motifs/packing. The set should also be large enough to ensure statistical reliability of the benchmarking results. We present a major update of the previously developed benchmark set of protein models. For each interactor, six models were generated with the model-to-native Cα RMSD in the 1 to 6 Å range. The models in the set were generated by a new approach, which corresponds to the actual modeling of new protein structures in the “real case scenario,” as opposed to the previous set, where a significant number of structures were model-like only. In addition, the larger number of complexes (165 vs. 63 in the previous set) increases the statistical reliability of the benchmarking. We estimated the highest accuracy of the predicted complexes (according to CAPRI criteria), which can be attained using the benchmark structures. The set is available at http://dockground.bioinformatics.ku.edu

    Simulated unbound structures for benchmarking of protein docking in the dockground resource

    Background: Proteins play an important role in biological processes in living organisms. Many protein functions are based on interaction with other proteins. The structural information is important for adequate description of these interactions. Sets of protein structures determined in both bound and unbound states are essential for benchmarking of the docking procedures. However, the number of such proteins in PDB is relatively small. A radical expansion of such sets is possible if the unbound structures are computationally simulated.
    Results: The dockground public resource provides data to improve our understanding of protein–protein interactions and to assist in the development of better tools for structural modeling of protein complexes, such as docking algorithms and scoring functions. A large set of simulated unbound protein structures was generated from the bound structures. The modeling protocol was based on 1 ns Langevin dynamics simulation. The simulated structures were validated on the ensemble of experimentally determined unbound and bound structures. The set is intended for large scale benchmarking of docking algorithms and scoring functions.
    Conclusions: A radical expansion of the unbound protein docking benchmark set was achieved by simulating the unbound structures. The simulated unbound structures were selected according to criteria from systematic comparison of experimentally determined bound and unbound structures. The set is publicly available at http://dockground.compbio.ku.edu

    Mass Spectrometry Based Molecular 3D-Cartography of Plant Metabolites

    Plants play an essential part in global carbon fixing through photosynthesis and are the primary food and energy source for humans. Understanding them thoroughly is therefore of highest interest for humanity. Advances in DNA and RNA sequencing and in protein and metabolite analysis allow the systematic description of plant composition at the molecular level. With imaging mass spectrometry, we can now add a spatial level, typically in the micrometer-to-centimeter range, to their compositions, essential for a detailed molecular understanding. Here we present an LC-MS based approach for 3D plant imaging, which is scalable and allows the analysis of entire plants. We applied this approach in a case study to pepper and tomato plants. Together with MS/MS spectra library matching and spectral networking, this non-targeted workflow provides the highest sensitivity and selectivity for the molecular annotations and imaging of plants, laying the foundation for studies of plant metabolism and plant-environment interactions

    Quantum algorithm and circuit design solving the Poisson equation

    The Poisson equation occurs in many areas of science and engineering. Here we focus on its numerical solution for an equation in d dimensions. In particular we present a quantum algorithm and a scalable quantum circuit design which approximates the solution of the Poisson equation on a grid with error $\varepsilon$. We assume we are given a superposition of function evaluations of the right hand side of the Poisson equation. The algorithm produces a quantum state encoding the solution. The number of quantum operations and the number of qubits used by the circuit are almost linear in d and polylog in $\varepsilon^{-1}$. We present quantum circuit modules together with performance guarantees which can also be used for other problems.
    Comment: 30 pages, 9 figures. This is the revised version for publication in New Journal of Physics
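For orientation, the grid discretization that such an algorithm approximates can be illustrated classically. The sketch below is not the quantum algorithm from the paper: it solves the one-dimensional problem -u''(x) = f(x) on (0, 1) with zero boundary values using the standard second-order finite-difference scheme and the Thomas algorithm for tridiagonal systems. The function name `solve_poisson_1d`, the boundary conditions, and the test right-hand side are all assumptions chosen for illustration.

```python
import math

def solve_poisson_1d(f, n):
    """Solve -u'' = f on (0, 1), u(0) = u(1) = 0, on n interior grid points.

    Classical illustration only. The discrete system is
    (-u[i-1] + 2*u[i] - u[i+1]) = h^2 * f(x[i]), solved by the Thomas algorithm.
    """
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    rhs = [h * h * f(x) for x in xs]

    # Forward elimination for diagonal 2 and off-diagonals -1.
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = -1.0 / 2.0
    d[0] = rhs[0] / 2.0
    for i in range(1, n):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom

    # Back substitution.
    u = [0.0] * n
    u[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        u[i] = d[i] - c[i] * u[i + 1]
    return xs, u

# Right-hand side chosen so that the exact solution is sin(pi * x).
xs, u = solve_poisson_1d(lambda x: math.pi ** 2 * math.sin(math.pi * x), 99)
max_error = max(abs(ui - math.sin(math.pi * x)) for x, ui in zip(xs, u))
print(max_error)
```

The error of this scheme shrinks quadratically with the grid spacing, so reaching a target error of ε classically requires a grid whose size grows as ε shrinks. The abstract's claim concerns how the quantum circuit's cost scales with d and ε^{-1} instead.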

    A Novel Combined Term Suggestion Service for Domain-Specific Digital Libraries

    Interactive query expansion can assist users during their query formulation process. We conducted a user study with over 4,000 unique visitors and four different design approaches for a search term suggestion service. As a basis for our evaluation we have implemented services which use three different vocabularies: (1) user search terms, (2) terms from a terminology service and (3) thesaurus terms. Additionally, we have created a new combined service which utilizes thesaurus terms and terms from a domain-specific search term recommender. Our results show that the thesaurus-based method is clearly used more often than the other single-method implementations. We interpret this as a strong indicator that term suggestion mechanisms should be domain-specific to be close to the user terminology. Our novel combined approach, which interconnects a thesaurus service with additional statistical relations, outperformed all other implementations. All our observations show that domain-specific vocabulary can support the user in finding alternative concepts and formulating queries.
    Comment: To be published in Proceedings of Theories and Practice in Digital Libraries (TPDL), 201

    Science Models as Value-Added Services for Scholarly Information Systems

    The paper introduces scholarly Information Retrieval (IR) as a further dimension that should be considered in the science modeling debate. The IR use case is seen as a validation model of the adequacy of science models in representing and predicting structure and dynamics in science. Particular conceptualizations of scholarly activity and structures in science are used as value-added search services to improve retrieval quality: a co-word model depicting the cognitive structure of a field (used for query expansion), the Bradford law of information concentration, and a model of co-authorship networks (both used for re-ranking search results). An evaluation of retrieval quality with science-model-driven services showed that the proposed models do improve retrieval quality. From an IR perspective, the models studied are therefore verified as expressive conceptualizations of central phenomena in science. Thus, it could be shown that the IR perspective can significantly contribute to a better understanding of scholarly structures and activities.
    Comment: 26 pages, to appear in Scientometrics